To structure a Python project better, I decided to learn about packages, modules and classes. This article also shows how to import all modules and classes in a directory.
1. Packages, modules and classes
A package is a collection of modules (a module is simply a Python source file, *.py) that can expose classes, functions and global variables.
1.1 Packages
To create a package, put an __init__.py file (it can be empty) into a directory (say lib/) so that Python treats lib as a package. For instance, a collection of frequently used modules is grouped into a package lib, as listed below in a hierarchical tree structure.
$ tree -P '*.py' .
.
├── main.py
├── lib
│ ├── __init__.py # treat this directory as a package
│ ├── analyzegraph.py # a Python module
│ ├── autovivification.py
│ ├── connectioneventsgenerator.py
│ ├── debugtraces.py
│ ├── distancepoints.py
│ ├── dominatingsets.py
│ ├── messageeventsgenerator.py
│ ├── output.py
│ ├── plotgraph.py
│ ├── processdatasets.py
│ ├── readgtfs.py
│ ├── routetable.py
│ └── typeconverter.py
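To make the role of __init__.py concrete, here is a minimal sketch that builds a throwaway package in a temporary directory and imports a module from it. The names demo_lib and greet are hypothetical and exist only for this illustration. On Python 3.3 and later a directory without __init__.py can also be imported as a namespace package, but a regular package still uses the file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project root containing a package named "demo_lib"
# (hypothetical name, chosen only for this sketch).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_lib"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # an empty file is enough
(pkg_dir / "output.py").write_text("def greet():\n    return 'hello'\n")

# Make the project root importable, then import a module from the package.
sys.path.insert(0, str(root))
output = importlib.import_module("demo_lib.output")
print(output.greet())
```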
Note the difference between two forms of the import statement:
- from package import item: the item can be either a submodule/subpackage of the package, or some other name defined in the package, like a function, class or variable.
- import item.subitem.subsubitem: each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.
from item.subitem import subsubitem
# is roughly equivalent to the following (which additionally binds the name item)
import item.subitem.subsubitem
subsubitem = item.subitem.subsubitem
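The two forms can be tried out with a package from the standard library. The sketch below uses xml.etree.ElementTree purely as a convenient example; any package with a submodule that defines a class would do.

```python
import importlib

import xml.etree.ElementTree          # binds the top-level name "xml"
from xml.etree import ElementTree     # binds only "ElementTree"

# Both forms load the same module object.
same_module = ElementTree is xml.etree.ElementTree

# "from package import item" also accepts a class defined in a module ...
from xml.etree.ElementTree import Element

# ... but the dotted "import a.b.c" form requires c to be a module or package,
# so asking the import system for a class by dotted name fails.
try:
    importlib.import_module("xml.etree.ElementTree.Element")
    dotted_import_failed = False
except ImportError:
    dotted_import_failed = True

print(same_module, dotted_import_failed)
```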
1.2 Modules
A module is simply a Python source file (modulename.py) containing Python definitions and statements.
The built-in function dir([object]) returns a sorted list of names:
- dir(): names defined in the current scope; it does not list the names of built-in functions and variables.
- dir(a module object): names of the module's attributes.
- dir(a class object): names of the class's attributes and, recursively, of the attributes of its bases.
- dir(any other object): names of the object’s attributes, its class’s attributes, and recursively the attributes of its class’s base classes.
>>> from lib import *
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'analyzegraph', 'autovivification', 'connectioneventsgenerator', 'debugtraces', 'distancepoints', 'dominatingsets', 'messageeventsgenerator', 'output', 'plotgraph', 'processdatasets', 'readgtfs', 'routetable', 'typeconverter']
# a module object
>>> dir(dominatingsets)
['DominatingSets', 'LABEL_CANDICATE_SETS', 'LABEL_CDS', 'LABEL_NAME', 'LABEL_NOT_CDS', 'OrderedDict', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'copy', 'nx', 'nxaa', 'plt']
# a class object
>>> dir(dominatingsets.DominatingSets)
['__doc__', '__module__', 'get_connected_dominating_sets_greedily', 'get_dominating_sets']
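The session above comes from my project. As a self-contained sketch of the same three cases, the snippet below uses the standard json module and two hypothetical classes, Base and Child, to show that dir() on a class also reports inherited attributes.

```python
import json


class Base:
    base_attr = 1


class Child(Base):
    child_attr = 2

    def __init__(self):
        self.instance_attr = 3


module_names = dir(json)        # attributes of a module object
class_names = dir(Child)        # class attributes, including inherited ones
instance_names = dir(Child())   # instance attributes plus those of its class

print("dumps" in module_names)
print("base_attr" in class_names, "instance_attr" in class_names)
print("instance_attr" in instance_names)
```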
The contents of dominatingsets.py (method bodies omitted):
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import networkx as nx
import networkx.algorithms.approximation as nxaa
from collections import OrderedDict
import copy
LABEL_NAME = 'node_label_cds'
LABEL_CDS = 'Y'
LABEL_CANDICATE_SETS = 'NG'
LABEL_NOT_CDS = 'N'
class DominatingSets:
    @classmethod
    def get_dominating_sets(cls, G, weight=None):
        pass  # body omitted in this listing

    @classmethod
    def get_connected_dominating_sets_greedily(cls, G, weight='degree'):
        pass  # body omitted in this listing
1.3 Classes
Python’s class mechanism is a mixture of the class mechanisms found in C++ and Modula-3.
class MyClass:
    class_variable = 'class variable'  # class variable shared by all instances

    def __init__(self, name):
        self.name = name  # instance variable unique to each instance
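A short usage sketch of the class above: rebinding the class variable on the class is visible through every instance, while each instance keeps its own name. The instance names a and b are arbitrary.

```python
class MyClass:
    class_variable = 'class variable'  # shared by all instances

    def __init__(self, name):
        self.name = name  # unique to each instance


a = MyClass('a')
b = MyClass('b')

# Rebinding the attribute on the class is seen through every instance.
MyClass.class_variable = 'changed'
print(a.class_variable, b.class_variable)

# Instance variables stay independent.
print(a.name, b.name)
```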
1.4 Load all modules
I want to load all modules under lib into main.py with from lib import *.
#!/usr/bin/env python
from lib import * # load all modules
dominating = dominatingsets.DominatingSets()
Running from lib import * implicitly executes lib/__init__.py, and the modules named in the list __all__ defined there are imported into the current namespace. Therefore, __all__ should be specified to load all modules. The contents of lib/__init__.py are as follows.
from os.path import dirname, basename, isfile
import glob
modules = glob.glob(dirname(__file__) + "/*.py")
__all__ = [basename(f)[:-3] for f in modules if isfile(f) and not basename(f).startswith('__')] # exclude __init__.py
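To check this behaviour in isolation, the sketch below writes the same __init__.py into a temporary package and runs the star import against it. The package name demo_all and the VALUE constants are hypothetical; exec() with an explicit namespace is used only so the imported names can be inspected easily.

```python
import sys
import tempfile
from pathlib import Path

# The same logic as lib/__init__.py above.
INIT_SOURCE = '''
from os.path import dirname, basename, isfile
import glob
modules = glob.glob(dirname(__file__) + "/*.py")
__all__ = [basename(f)[:-3] for f in modules
           if isfile(f) and not basename(f).startswith('__')]
'''

# Build a throwaway package with two modules (hypothetical names).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_all"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(INIT_SOURCE)
(pkg_dir / "output.py").write_text("VALUE = 1\n")
(pkg_dir / "routetable.py").write_text("VALUE = 2\n")

sys.path.insert(0, str(root))

# Run the star import in a fresh namespace and list what it brought in.
namespace = {}
exec("from demo_all import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only the names listed in __all__ end up in the namespace; helper names such as glob or modules from __init__.py are not exported.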
1.5 Load all classes
In section 1.4, from lib import * loads the modules, but not the classes. Therefore, to access a class, its name must be prefixed with the module name and a dot, e.g., dominatingsets.DominatingSets() instead of DominatingSets(). Refer to [4] to import all classes (PS: it doesn't work for me).
import os, sys
path = os.path.dirname(os.path.abspath(__file__))
for py in [f[:-3] for f in os.listdir(path) if f.endswith('.py') and f != '__init__.py']:
    mod = __import__('.'.join([__name__, py]), fromlist=[py])
    classes = [getattr(mod, x) for x in dir(mod) if isinstance(getattr(mod, x), type)]
    for cls in classes:
        setattr(sys.modules[__name__], cls.__name__, cls)
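One possible explanation for the failure, which I have not verified against my project: under Python 2, a class declared without a base class (like DominatingSets in section 1.2) is an old-style class and is not an instance of type, so the isinstance check would skip it. Below is a Python 3 sketch of the same idea, not the code from [4]. It uses pkgutil, importlib and inspect, and the package name demo_classes is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical __init__.py body: re-export every class defined in the
# package's own modules.
INIT_SOURCE = '''
import importlib
import inspect
import pkgutil

for _info in pkgutil.iter_modules(__path__):
    _mod = importlib.import_module(__name__ + "." + _info.name)
    for _name, _obj in inspect.getmembers(_mod, inspect.isclass):
        # Skip classes the module merely imported (e.g. OrderedDict).
        if _obj.__module__ == _mod.__name__:
            globals()[_name] = _obj
'''

MODULE_SOURCE = '''
from collections import OrderedDict

class DominatingSets:
    pass
'''

# Build a throwaway package on disk and import it.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_classes"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(INIT_SOURCE)
(pkg_dir / "dominatingsets.py").write_text(MODULE_SOURCE)

sys.path.insert(0, str(root))
demo_classes = importlib.import_module("demo_classes")

print(demo_classes.DominatingSets)
print(hasattr(demo_classes, "OrderedDict"))
```

Filtering on __module__ is a design choice of this sketch: without it, every class a module imports would also be re-exported from the package.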
2. Naming conventions
As described in PEP 0008 -- Style Guide for Python Code,
- Packages should have short, all-lowercase names, although the use of underscores is discouraged.
- Modules should also have short, all-lowercase names. Underscores can be used in the module name if it improves readability.
- Classes should normally use the CapWords convention.
PS: When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).
References:
[1] StackOverflow: How do I load all modules under a subdirectory in Python?
[2] PEP 0008 -- Style Guide for Python Code: Package and Module Names
[3] StackOverflow: Loading all modules in a folder in Python
[4] StackOverflow: Import all classes in directory?
[5] Python Guide: Structuring Your Project